Canalizing Kauffman networks: Non-ergodicity and its effect on their critical behavior 
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Boolean Networks have been used to study numerous phenomena, including gene regulation, 
neural networks, social interactions, and biological evolution. Here, we propose a general method 
for determining the critical behavior of Boolean systems built from arbitrary ensembles of Boolean 
functions. In particular, we solve the critical condition for systems of units operating according to 
canalizing functions and present strong numerical evidence that our approach correctly predicts the 
phase transition from order to chaos in such systems. 
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Biological and social systems typically comprise a large 
number of interacting units coupled through a nontrivial 
network of interactions. Examples of such systems in- 
clude the metabolic processes in living cells Q and social 
interactions in human groups 0, 0] . Remarkably, these 
systems exhibit a high degree of self-organization that 
ensures their continued functioning and allows them to 
respond to environmental changes. A challenging aspect 
in the study of complex systems is how to model both the 
diversity of the evolving units and the intricate structure 
of their interactions 0. 

Discrete (agent-based) models are among the most 
common methods used to tackle this challenge. In partic- 
ular, Boolean networks (BNs) have bee n used to model 
systems as varied as gene regulation networks [j| , evolu- 
tion 0, and neuronal networks Q — see for a review of 
BN and their applications. It has been shown that BNs 
share many common properties with real systems 0, @ , 
the most remarkable probably being a transition from an 
ordered to a chaotic phase. 

A BN consists of N interacting units whose states Gi 
are binary variables. Each unit i is connected to fcj other 
units and its state is updated according to a specific rule 

a i (t+l) = F i [a il (t),a i2 (t),...,a iki (t)}, (1) 

where Fi is a Boolean function, and the {erj . } are the 
states of the units connected to i, which may or may not 
include i itself. Boolean functions are represented by a 
truth table that lists the output of the function for each 
of the possible set of input values. For a function with 
k variables there are 2 k possible input set, yielding 2 2 
different possible functions. 

The ensemble of functions £ defines the probability 
with which each function appears in the system. In the 
original formulation, BNs have the coupling connections 
chosen at random and the Boolean functions Fi drawn 
from an ensemble £ ra nd(p), where p is the fraction of 
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active states in the output of the functions. In the fol- 
lowing we will refer to p as the "bias" , although the case 
p = 0.5 is actually unbiased. This instance of the model is 
usually denoted Kauffman networks or random Boolean 
networks (RBNs). 

Typically, BNs display a transition from order to chaos. 
In the ordered phase, the network evolves toward limit- 
ing cycles and, upon a perturbation, the system usually 
converges back to the initial limiting cycle. In the chaotic 
phase, the lengths of the attractor cycles grow exponen- 
tially with N and almost any perturbation will drive the 
system toward a different attractor. The critical behavior 
of RBNs has been determined by means of several differ- 
ent techniques [ToL ITlj . Not much, however, is known 
about the critical behavior of BN with other ensembles 
of functions. 

In a recent paper, Shmulevich and Kauffman |12| sug- 
gested that the dynamical behavior of a BN can be re- 
lated to the "average influence" of the variables of its 
Boolean functions. Here, we use the concept of dam- 
age spreading to demonstrate the role of the influence 
in the dynamical behavior of BNs. We show that, since 
BNs are nonlinear models not likely to have ergodic dy- 
namics, a naive average of the influence over the whole 
phase-space of BNs does not necessarily yield a correct 
estimate of the effective influence of the Boolean vari- 
ables. We thus revise the definition of average influence 
in order to account for the non-ergodicity of the dynam- 
ics of BNs. Our definition enables us to derive the critical 
condition of networks of canalizing Boolean functions, a 
case of particular biological relevance 0, . Finally, 
we show numerical evidence that our method correctly 
predicts the critical condition for networks of canalizing 
Boolean functions. 

The dynamics of BNs can be quantified by measuring 
the spread of "damage" through the network. This is 
done by comparing the parallel evolution of two "repli- 
cas" of the system. The replicas have identical Boolean 
functions and coupling connections, but the initial state 
of the units in the replicas differs in only a small frac- 
tion of the units. The damage, which is also known as 
the Hamming distance h(t), is defined as the fraction of 
units that are in different states in the two replicas. If, 
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FIG. 1: The critical condition for Boolean networks (BN). 
The Hamming distance h(t) is defined as the normalized dif- 
ference between two replicas of a BN at time t. We show in 
this figure the iterative mapping for the Hamming distance 
for the case of random BNs with bias p = 0.6 The solid 
line corresponds to a network with connectivity k = 4 and 
the dashed line to k = 2. The symbols indicate the Ham- 
ming distances obtained numerically for a few time steps in 
the evolution of two biased networks with k = 4 and k = 2. 
The identity mapping, indicated by the dotted line, and the 
arrows are included to illustrate the time evolution of h(t). 
For the case k — 4, the Hamming distance remains at finite 
value, while for k = 2 it tends to zero. The system is in the 
ordered phase when the Hamming distance converges to zero. 
Note that, h = is a fixed point of the mapping, but is unsta- 
ble whenever the iterative mapping at the origin grows with 
a steeper slope than the identity line; cf. Eq. 



after some transient time, the evolving replicas are likely 
to converge to the same state, i.e., h(t) — * 0, then the 
dynamics of the system is robust with regard to small 
perturbations, a signature of the ordered phase. If, how- 
ever, the replicas are likely to never converge, then the 
dynamics is sensitive to small perturbations to the initial 
state, a signature of the chaotic phase. As discussed in 
the caption of Fig. ^ a system is in the ordered phase 
whenever 



dh(t + l) 



dh(t) 



< 1, 



(2) 



Significantly, the susceptibility of unit i to "damage" 
in its neighbors can be related to the influence of their 
variables on F^. One defines the influence Ij(Fi) of the 
jth variable of a function Fi as the probability that the 
function F, changes its value when the value of Oj is 
changed 0, 0|. The average influence of a function 
I{Fi) = p^jIjfFi), and the average influence I{£) of 
an ensemble £ of Boolean functions is I{£) = (I(Fi))£, 
where (■■■)£ indicates an average over the ensemble £ . 

One can generalize this definition to multiple vari- 
ables [H: iW ee / is the average influence of one vari- 
able, is the average influence of two variables, and 



so on. The probability that an arbitrary unit is damaged 
in the next step depends on the number k c i of damaged 
inputs it gets and on the influence j( fc<J ) of kd variables. 
Since the inputs are an arbitrary sample of the entire 
network we can assume that kd follows a binomial distri- 
bution and write the evolution of the Hamming distance 
as 



h(t + l) = l(kd 



[h(t)] k «[l-h(t)] 



k — k d 



(3) 



where 
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the binomial coefficient. Thus, the influ- 



ences /( fed ) determine the shape of the iterative mapping 
of h(t). Inserting Eq. into Eq. (J5J), we have that the 
critical condition depends only on the average influence 
of one variable, 



I(£,k c )k c = 1. 



(4) 



Equation |0J enables us to determine the critical condi- 
tion for BNs with arbitrary ensembles of functions 01 ■ 

This is not trivial, however. The difficulty in using 
Eq. (0} lies in computing the influence of the variables of 
the Boolean functions present in the network. In prin- 
ciple, the influence of the variables can be determined 
by counting in the truth table the number of times that 
by changing the value of only one variable results in a 
change in the value of F. This approach, which was ex- 
plored in [r3 |. implicitly assumes ergodicity, that is, all 
inputs can arise with the same probability during evo- 
lution, and time average over the states visited by the 
network yields the same result as average over the whole 
phase-space. This is an implausible assumption which is 
unlikely to hold for the dynamics of arbitrary BNs. 

In some instances, however, an equaly weighted av- 
erage does yield to correct results. An example is the 
ensamble of RBNs pl|. Note that this does not imply 
that RBNs are ergodic. In fact, the dynamics of BNs 
in general converge to limiting cycles that occupy only a 
fraction of the entire phase-space. To correctly average 
the influence of the Boolean variables, one must measure 
the influence only on those states composing the limiting 
cycles. 

We can verify in which cases an equally weighted av- 
erage can work. If one assumes that the states of the 
neighbors of a unit are not correlated with the state of 
the unit it self (random-graph approximation), it follows 
that the input acting on the unit is a statistical sample 
of the whole network. Thus, the probability of a certain 
input depends on the fraction q of units that are in the 
active state. That is, if the network has a bias toward 
activity, q > 0.5, the inputs with more Is will be more 
frequent than the inputs with more 0s. Therefore, the 
activity q of the network should be taken into account 
when computing the average influence of the BN. The 
reason why a simple average over the whole phase-space 
works in RBNs and a few other cnsambles is that, on 
these networks, the influence docs not depend on q, thus, 
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averaging over the states of the limiting cycles yields the 
same result as averaging over the whole phase-space. As 
we will demonstrate later, this property does not gener- 
ally hold for arbitrary ensembles of Boolean functions. 

In the following, we focus on the ensemble of canaliz- 
ing Boolean functions £ can . Studies of gene regulation in 
eukaryots have showed that the Boolean idealization is a 
good approximation for the non-linear dynamics of this 
system and that the gene regulating mechanisms have a 
strong bias toward canalizing functions [l^. A Boolean 
function is canalizing if whenever one variable, the canal- 
izing variable, takes a given value, the canalizing value, 
the function always yields the same output. The en- 
semble of canalizing functions can be separated into four 
mutually exclusive classes of functions: 
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F(ax,a 2 , •••) 

F((Tl,(X2, ...) 

F(ax,a 2 , ...) 
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a x OR 
(NOT a x ) AND 
(NOT ax) OR 

a x AND 



G(a 2 ,...), (5a) 

G(a 2 , •■■), (5b) 

G(a 2 ,...), (5c) 

G(a 2 ,...), (5d) 



where ax is the canalizing variable, "AND," "OR," and 
"NOT" are the logical Boolean operators, and G is the 
non-canalizing part of the function that carries the de- 
pendence on the remaining variables. Each of this classes 
represents a different type of regulation. The class de- 
scribed by (|5a|l represents "sufficient activators," that is, 
ax = 1 is sufficient to assure an active state for the unit. 
The class described by <|5b(l represents "sufficient repres- 
sors," that is, ax — 1 always results in an inactive state 
for the unit. The classes described by (|5*cjl and l|5d|) rep- 
resent "necessary repressors" and "necessary activators," 
respectively. In these cases, a\ = is also enough to de- 
termine the output of the function. 

The average influence for the ensamble £ can depends 
on the probability P can with which ax takes the canal- 
izing value. For classes described by H5a(l and l|5b(l . 
ax gives a sufficient condition for activation or repres- 
sion respectively. This means that for these classes the 
canalizing value is an active state. On other hand, for 
classes l|5c|l and (|5djl the canalizing value is an inactive 
state. If both cases are equally present on the network, 
one has always P ca n = 0.5. However, if one of the canal- 
izing values is more frequent than the other, P can will de- 
pend on the fraction q of units in the active state. To ac- 
count for this effect we define rj as the fraction of the func- 
tions in the ensemble that fall into classes (|5a|) and 15bt . 
Thus, the probability that the canalizing variable takes 
the canalizing value is 



Pcan = qr} + (1 - q)(l - v). 



(6) 



The next step is determining the average activity q of 
the network when the limiting cycles are reached. To do 
this, we need to define some relevant parameters charac- 
terizing the ensemble of canalizing functions. Note that, 
for the classes described by Eqs. I|5a|l and (|5c|l . the use 
of the "OR" operator means that the values of a\ and G 
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FIG. 2: Order-chaos phase-transition for networks of canaliz- 
ing Boolean functions. The color coding represents the prob- 
ability that, after changing the state of a random unit of a 
network in a limiting cycle, the system returns to the same 
cycle. The resilience of the network to small perturbations is 
a signature of the ordered phase. The parameter r\ gives the 
probability with which a Boolean function in the network is 
canalized by an active input. In the trivial case r\ = 0.5, the 
average influence does not depends on the activity of the net- 
work and the influence can be computed with an average over 
the truth table of the Boolean functions in the network, and 
one finds the critical condition when kl — 1 in Eq. @. In the 
non-trivial case r\ = 0.1 one finds no difference in the value of 
the influence when computed from the truth table of the func- 
tions. However, it is clear that the critical curve is sensible 
to the value of r\. The difference is due to the fact that in the 
latter case the average influence depends on the activity of 
the network and one has to make a weighted average over the 
states occupied by the limiting cycles and the critical condi- 
tion is kl = 1 in Eq. (JHJ. In both cases, as the network grows, 
the transition becomes sharper and approaches the critical 
curves (dotted lines). 



give two alternative conditions yielding an active output, 
while for the classes described by Eqs. I|5b|) and 15d|l . the 
"AND" operator means that the values of a\ and G give 
two necessary conditions for obtaining an active output. 
The use of the "OR" operator thus results in a bias to- 
ward activity. To quantify this bias, we define p\ as the 
fraction of the functions in the ensemble that fall into 
classes (|5a|) and (jSc}. Note that, the bias of the canaliz- 
ing functions toward the active state will also depend on 
G, the non-canalizing part of F. We assume that G is 
chosen as a random Boolean function with bias p 2 . 

It is possible to measure the probability that a random 
input results in an active output. This probability is 
the average bias p can = (px + P2)/2 of the ensemble of 
canalizing functions E can - However, one can not assume 
that in the limiting cycles any input happen with the 
same chance. For the ensemble £ ccm , the average activity 
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for the limiting cycles is given by 

q = PlPcan + P2(l - Pcan), (7) 

where the first term on the right accounts for the prob- 
ability that the function is being canalized to activity 
and the second for the probability that the function G is 
driving the function to activity. 

We can now proceed and calculate the average influ- 
ence for the ensemble of canalizing functions. We will 
consider first only the average influence of the canaliz- 
ing variable I\, which is given by the probability that 
G = when the OR operator is chosen, plus the prob- 
ability that G = 1 when the AND operator is chosen; 
h = — P2) + (1 — P\)Pi- The influence of the 

remaining variables It depends on the probability that 
the functions are not locked by the canalizing variable, 
1 — P can , and on the bias p 2 of G. Finally, we have 

h = 2p 2 (l - j0 2 )(l - Pcan): an d: 

kl = p 1 +p2-2p lP 2+2p 2 (l-p 2 )[r]+q(l-2ri))(k-l). (8) 

If one assumes that all inputs in the truth table con- 
tribute with the same weight to the average, then P can = 
0.5, and 

kI = Pl +p 2 - 2p lP2 + (k— l)p 2 (l - p 2 )- (9) 

One of the cases where Eq. works is when r\ = 0.5. 

We next test our theoretical results against numeri- 
cal simulations of BNs of canalizing functions. This is 
done by building random networks with Boolean func- 
tions obeying the ensemble of canalizing functions de- 
scribed by Eq. J5J. We assign random initial states to the 
networks and let them evolve until they reach a limiting 
cycle • We then make a perturbation by changing the 
state of one of the units in the network. The resilience 
of the system to this "damage" is the probability that, 



after the perturbation, the system converges back to the 
initial limiting cycle We show in Fig. that, as the 
system size grows, the transition from order to chaos be- 
comes sharper and approaches a critical condition where 
kl = 1; cf. Eq. ©■ 

Note that, when the network has a bias in the canal- 
izing value, rj = 0.1, there is a considerable reduction in 
the region occupied by the chaotic phase, mainly in the 
region where the network is biased to the inactive state: 
low p\ and p 2 . This bias for an inactive canalizing value 
was observed in the mechanisms of gene regulation [T^j 
where the transcription of a given gene may depend on 
the presence of several activator proteins, that is, a single 
inactive input — the absence of the one of the activators — 
can result in an inactive state — no transcription. 

The major finding of this study is that, by using the 
concepts of influence of Boolean variables and damage 
spreading, we are able to obtain the critical behavior of 
Boolean networks built from arbitrary ensembles of func- 
tions. We show that for most networks the effective in- 
fluence of the variables cannot be obtained by a simple 
average over the truth table of the functions. We further 
obtain an expression for the influence of the variables 
for networks of canalizing Boolean functions and present 
strong numerical evidence that our method can accu- 
rately predict the critical transition for these networks. 
Our work suggests that the approach described here can 
solve the critical transition of other ensembles of Boolean 
functions such as nested canalizing functions |l4j | — which 
are thought to be a valuable model for the description 
of gene regulation networks — or random threshold func- 
tions — a common model for neural networks. 
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